In this notebook, we're going to define and train a CycleGAN to read in an image from a set $X$ and transform it so that it looks as if it belongs in set $Y$. Specifically, we'll look at a set of images of Yosemite national park taken either during the summer or winter. The seasons are our two domains!
The objective will be to train generators that learn to transform an image from domain $X$ into an image that looks like it came from domain $Y$ (and vice versa).
Some examples of image data in both sets are pictured below.

These images do not come with labels, but CycleGANs give us a way to learn the mapping between one image domain and another using an unsupervised approach. A CycleGAN is designed for image-to-image translation and it learns from unpaired training data. This means that in order to train a generator to translate images from domain $X$ to domain $Y$, we do not have to have exact correspondences between individual images in those domains. For example, in the paper that introduced CycleGANs, the authors are able to translate between images of horses and zebras, even though there are no images of a zebra in exactly the same position as a horse or with exactly the same background, etc. Thus, CycleGANs enable learning a mapping from one domain $X$ to another domain $Y$ without having to find perfectly-matched, training pairs!

A CycleGAN is made of two types of networks: discriminators, and generators. In this example, the discriminators are responsible for classifying images as real or fake (for both $X$ and $Y$ kinds of images). The generators are responsible for generating convincing, fake images for both kinds of images.
This notebook will detail the steps you should take to define and train such a CycleGAN.
- You'll load in the image data using PyTorch's DataLoader class to efficiently read in images from a specified directory.
- Then, you'll be tasked with defining the CycleGAN architecture according to provided specifications. You'll define the discriminator and the generator models.
- You'll complete the training cycle by calculating the adversarial and cycle consistency losses for the generator and discriminator network and completing a number of training epochs. It's suggested that you enable GPU usage for training.
- Finally, you'll evaluate your model by looking at the loss over time and looking at sample, generated images.
We'll first load in and visualize the training data, importing the necessary libraries to do so.
# !unzip summer2winter_yosemite.zip # can comment out after executing once
# loading in and transforming data
import os
import torch
from torch.utils.data import DataLoader
import torchvision
import torchvision.datasets as datasets
import torchvision.transforms as transforms
# visualizing data
import matplotlib.pyplot as plt
import numpy as np
import warnings
%matplotlib inline
The get_data_loader function returns training and test DataLoaders that can load data efficiently and in specified batches. The function has the following parameters:
- image_type: summer or winter, the names of the directories where the X and Y images are stored
- image_dir: name of the main image directory, which holds all training and test images
- image_size: resized, square image dimension (all images will be resized to this dim)
- batch_size: number of images in one batch of data

The test data is strictly for feeding to our generators, later on, so we can visualize some generated samples on fixed, test data.
You can see that this function is also responsible for making sure our images are resized to the right square size (128x128x3) and converted into Tensor image types.
It's suggested that you use the default values of these parameters.
Note: If you are trying this code on a different set of data, you may get better results with larger image_size and batch_size parameters. If you change the batch_size, make sure that you create complete batches in the training loop; otherwise, you may get an error when trying to save sample data.
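As an aside, one simple way to guarantee complete batches is the DataLoader's drop_last flag; a minimal sketch, assuming a train_dataset like the one built in the function below:

# hypothetical tweak: drop the final, incomplete batch so every batch has exactly batch_size images
loader = DataLoader(dataset=train_dataset, batch_size=16, shuffle=True, drop_last=True)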
def get_data_loader(image_type, image_dir='summer2winter_yosemite',
image_size=128, batch_size=16, num_workers=0):
"""Returns training and test data loaders for a given image type, either 'summer' or 'winter'.
These images will be resized to 128x128x3, by default, converted into Tensors, and normalized.
"""
# resize and normalize the images
transform = transforms.Compose([transforms.Resize(image_size), # resize to 128x128
transforms.ToTensor()])
# get training and test directories
image_path = './' + image_dir
train_path = os.path.join(image_path, image_type)
test_path = os.path.join(image_path, 'test_{}'.format(image_type))
# define datasets using ImageFolder
train_dataset = datasets.ImageFolder(train_path, transform)
test_dataset = datasets.ImageFolder(test_path, transform)
# create and return DataLoaders
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True, num_workers=num_workers)
test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False, num_workers=num_workers)
return train_loader, test_loader
# Create train and test dataloaders for images from the two domains X and Y
# image_type = directory names for our data
dataloader_X, test_dataloader_X = get_data_loader(image_type='summer')
dataloader_Y, test_dataloader_Y = get_data_loader(image_type='winter')
Below we provide a function imshow that reshapes given images and converts them to NumPy images so that they can be displayed by plt. This cell should display a grid that contains a batch of image data from set $X$.
# helper imshow function
def imshow(img):
npimg = img.numpy()
plt.imshow(np.transpose(npimg, (1, 2, 0)))
# get some images from X
dataiter = iter(dataloader_X)
# the "_" is a placeholder for no labels
images, _ = next(dataiter)
# show images
fig = plt.figure(figsize=(12, 8))
imshow(torchvision.utils.make_grid(images))
Next, let's visualize a batch of images from set $Y$.
# get some images from Y
dataiter = iter(dataloader_Y)
images, _ = next(dataiter)
# show images
fig = plt.figure(figsize=(12,8))
imshow(torchvision.utils.make_grid(images))
We need to do a bit of pre-processing; we know that the output of our tanh-activated generator will contain pixel values in a range from -1 to 1, so we need to rescale our training images to a range of -1 to 1. (Right now, they are in a range from 0-1.)
# current range
img = images[0]
print('Min: ', img.min())
print('Max: ', img.max())
Min: tensor(0.0196)
Max: tensor(0.9373)
# helper scale function
def scale(x, feature_range=(-1, 1)):
''' Scale takes in an image x and returns that image, scaled
with a feature_range of pixel values from -1 to 1.
This function assumes that the input x is already scaled from 0-1.'''
# scale from 0-1 to feature_range
f_min, f_max = feature_range
x = x * (f_max - f_min) + f_min
return x
# scaled range
scaled_img = scale(img)
print('Scaled min: ', scaled_img.min())
print('Scaled max: ', scaled_img.max())
Scaled min: tensor(-0.9608) Scaled max: tensor(0.8745)
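As an aside, this scaling could equivalently be folded into the transform pipeline instead; a minimal sketch (an alternative, not what this notebook does), using torchvision's Normalize with per-channel mean and std of 0.5, which maps [0, 1] Tensor values to [-1, 1]:

# equivalent alternative: normalize inside the transform pipeline (assuming the 128 default image size)
transform = transforms.Compose([transforms.Resize(128),
                                transforms.ToTensor(),
                                transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])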
A CycleGAN is made of two discriminator and two generator networks.
The discriminators, $D_X$ and $D_Y$, in this CycleGAN are convolutional neural networks that see an image and attempt to classify it as real or fake. In this case, real is indicated by an output close to 1 and fake by an output close to 0. The discriminators have the following architecture:

This network sees a 128x128x3 image and passes it through 5 convolutional layers; the first four each downsample the image by a factor of 2. A ReLU activation is applied to the output of all but the last layer, and layers 2-4 also apply BatchNorm (the first layer does not). The last layer acts as a classification layer that outputs one value.
To define the discriminators, you're expected to use the provided conv function, which creates a convolutional layer + an optional batch norm layer.
import torch.nn as nn
import torch.nn.functional as F
# helper conv function
def conv(in_channels, out_channels, kernel_size, stride=2, padding=1, batch_norm=True):
"""Creates a convolutional layer, with optional batch normalization.
"""
layers = []
conv_layer = nn.Conv2d(in_channels=in_channels, out_channels=out_channels,
kernel_size=kernel_size, stride=stride, padding=padding, bias=False)
layers.append(conv_layer)
if batch_norm:
layers.append(nn.BatchNorm2d(out_channels))
return nn.Sequential(*layers)
Your task is to fill in the __init__ function with the specified 5 layer conv net architecture. Both $D_X$ and $D_Y$ have the same architecture, so we only need to define one class, and later instantiate two discriminators.
It's recommended that you use a kernel size of 4x4 and use that to determine the correct stride and padding size for each layer. This Stanford resource may also help in determining stride and padding sizes.
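As a quick sanity check on those choices, recall the standard convolution output-size formula, out = (in - kernel + 2*padding) / stride + 1; a tiny sketch (the helper name conv_out_size is just for illustration):

# with a 4x4 kernel, stride 2, and padding 1, each layer halves the spatial size
def conv_out_size(in_size, kernel_size, stride, padding):
    return (in_size - kernel_size + 2*padding) // stride + 1

print(conv_out_size(128, 4, 2, 1))  # 64
print(conv_out_size(64, 4, 2, 1))   # 32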
The forward function defines how an input image moves through the discriminator; the most important thing is to pass it through your convolutional layers in order, with a ReLU activation function applied to all but the last layer.
You should not apply a sigmoid activation function to the output here, because we plan to use a squared error loss for training. You can read more about this loss function later in the notebook.
class Discriminator(nn.Module):
def __init__(self, conv_dim=64):
super(Discriminator, self).__init__()
# Define all convolutional layers
# Should accept an RGB image as input and output a single value
# Convolutional layers, increasing in depth
# first layer has *no* batchnorm
self.conv1 = conv(3, conv_dim, 4, batch_norm=False) # x, y = 64, depth 64
self.conv2 = conv(conv_dim, conv_dim*2, 4) # (32, 32, 128)
self.conv3 = conv(conv_dim*2, conv_dim*4, 4) # (16, 16, 256)
self.conv4 = conv(conv_dim*4, conv_dim*8, 4) # (8, 8, 512)
# Classification layer
self.conv5 = conv(conv_dim*8, 1, 8, stride=1, padding=0, batch_norm=False)
def forward(self, x):
# relu applied to all conv layers but last
out = F.relu(self.conv1(x))
out = F.relu(self.conv2(out))
out = F.relu(self.conv3(out))
out = F.relu(self.conv4(out))
# last, classification layer
out = self.conv5(out)
return out
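As an optional check (a sketch, assuming the layer sizes above), you can pass a dummy batch through an untrained discriminator and confirm it produces one value per image:

# shape check: 2 RGB 128x128 images in, one value per image out
test_D = Discriminator(conv_dim=64)
dummy = torch.randn(2, 3, 128, 128)
print(test_D(dummy).shape)  # expect torch.Size([2, 1, 1, 1])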
The generators, G_XtoY and G_YtoX (sometimes called F), are made of an encoder, a conv net that is responsible for turning an image into a smaller feature representation, and a decoder, a transpose-conv net that is responsible for turning that representation into a transformed image. These generators, one from XtoY and one from YtoX, have the following architecture:

This network sees a 128x128x3 image, compresses it into a feature representation as it goes through three convolutional layers and reaches a series of residual blocks. It goes through a few (typically 6 or more) of these residual blocks, then it goes through three transpose convolutional layers (sometimes called de-conv layers) which upsample the output of the resnet blocks and create a new image!
Note that most of the convolutional and transpose-convolutional layers have BatchNorm and ReLU functions applied to their outputs, with the exception of the final transpose convolutional layer, which has a tanh activation function applied to the output. Also, the residual blocks are made of convolutional and batch normalization layers, which we'll go over in more detail, next.
To define the generators, you're expected to define a ResidualBlock class which will help you connect the encoder and decoder portions of the generators. You might be wondering, what exactly is a Resnet block? It may sound familiar from something like ResNet50 for image classification, pictured below.

ResNet blocks rely on connecting the output of one layer with the input of an earlier layer. The motivation for this structure is as follows: very deep neural networks can be difficult to train. Deeper networks are more likely to have vanishing or exploding gradients and, therefore, have trouble reaching convergence; batch normalization helps with this a bit. However, during training, we often see that deep networks respond with a kind of training degradation. Essentially, the training accuracy stops improving and gets saturated at some point during training. In the worst cases, deep models would see their training accuracy actually worsen over time!
One solution to this problem is to use Resnet blocks that allow us to learn so-called residual functions as they are applied to layer inputs. You can read more about this proposed architecture in the paper, Deep Residual Learning for Image Recognition by Kaiming He et al., and the below image is from that paper.

Usually, when we create a deep learning model, the model (several layers with activations applied) is responsible for learning a mapping, M, from an input x to an output y.
M(x) = y (Equation 1)
Instead of learning a direct mapping from x to y, we can instead define a residual function
F(x) = M(x) - x
This looks at the difference between a mapping applied to x and the original input, x. F(x) is, typically, two convolutional layers + normalization layer with a ReLU in between. These convolutional layers should have the same number of inputs as outputs. This mapping can then be written as the following: a function of the residual function and the input x. The addition step creates a kind of skip connection that connects the input x to the output, y:
M(x) = F(x) + x (Equation 2), or
y = F(x) + x (Equation 3)
The idea is that it is easier to optimize this residual function F(x) than it is to optimize the original mapping M(x). Consider an example; what if we want y = x?
From our first, direct mapping equation, Equation 1, we could set M(x) = x, but it is easier to solve the residual equation F(x) = 0, which, when plugged into Equation 3, yields y = x.
To define the ResidualBlock class, we'll define residual functions (a series of layers), apply them to an input x, and add them to that same input. This is defined just like any other neural network, with an __init__ function and the addition step in the forward function.
In our case, you'll want to define the residual block as two convolutional layers with batch normalization, each with the same input and output size, and a ReLU applied after the first. Then, in the forward function, add the input x to the output of this residual block. Feel free to use the helper conv function from above to create this block.
# residual block class
class ResidualBlock(nn.Module):
"""Defines a residual block.
This adds an input x to a convolutional layer (applied to x) with the same size input and output.
These blocks allow a model to learn an effective transformation from one domain to another.
"""
def __init__(self, conv_dim):
super(ResidualBlock, self).__init__()
# conv_dim = number of inputs
# define two convolutional layers + batch normalization that will act as our residual function, F(x)
# layers should have the same shape input as output; I suggest a kernel_size of 3
self.conv_layer1 = conv(in_channels=conv_dim, out_channels=conv_dim,
kernel_size=3, stride=1, padding=1, batch_norm=True)
self.conv_layer2 = conv(in_channels=conv_dim, out_channels=conv_dim,
kernel_size=3, stride=1, padding=1, batch_norm=True)
def forward(self, x):
# apply a ReLU activation to the output of the first layer
# return a summed output, x + resnet_block(x)
out_1 = F.relu(self.conv_layer1(x))
out_2 = x + self.conv_layer2(out_1)
return out_2
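A small, optional sketch to confirm a block preserves its input shape, which is exactly what the x + F(x) addition requires:

# a residual block should map (N, C, H, W) -> (N, C, H, W)
test_block = ResidualBlock(conv_dim=256)
dummy = torch.randn(2, 256, 16, 16)
print(test_block(dummy).shape)  # expect torch.Size([2, 256, 16, 16])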
To define the generators, you're expected to use the above conv function, ResidualBlock class, and the below deconv helper function, which creates a transpose convolutional layer + an optional batchnorm layer.
# helper deconv function
def deconv(in_channels, out_channels, kernel_size, stride=2, padding=1, batch_norm=True):
"""Creates a transpose convolutional layer, with optional batch normalization.
"""
layers = []
# append transpose conv layer
layers.append(nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride, padding, bias=False))
# optional batch norm layer
if batch_norm:
layers.append(nn.BatchNorm2d(out_channels))
return nn.Sequential(*layers)
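For transpose convolutions, the output-size formula is out = (in - 1)*stride - 2*padding + kernel, so a 4x4 kernel with stride 2 and padding 1 exactly doubles the spatial size; a tiny check (the helper name deconv_out_size is just for illustration):

# each deconv layer with kernel 4, stride 2, padding 1 doubles height and width
def deconv_out_size(in_size, kernel_size, stride, padding):
    return (in_size - 1)*stride - 2*padding + kernel_size

print(deconv_out_size(16, 4, 2, 1))  # 32
print(deconv_out_size(32, 4, 2, 1))  # 64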
- Complete the __init__ function with the specified 3-layer encoder convolutional net, a series of residual blocks (the number of which is given by n_res_blocks), and then a 3-layer decoder transpose convolutional net.
- Complete the forward function to define the forward behavior of the generators. Recall that the last layer has a tanh activation function.

Both $G_{XtoY}$ and $G_{YtoX}$ have the same architecture, so we only need to define one class, and later instantiate two generators.
class CycleGenerator(nn.Module):
def __init__(self, conv_dim=64, n_res_blocks=6):
super(CycleGenerator, self).__init__()
# 1. Define the encoder part of the generator
# initial convolutional layer given, below
self.conv1 = conv(3, conv_dim, 4)
self.conv2 = conv(conv_dim, conv_dim*2, 4)
self.conv3 = conv(conv_dim*2, conv_dim*4, 4)
# 2. Define the resnet part of the generator
# Residual blocks
res_layers = []
for layer in range(n_res_blocks):
res_layers.append(ResidualBlock(conv_dim*4))
# use sequential to create these layers
self.res_blocks = nn.Sequential(*res_layers)
# 3. Define the decoder part of the generator
# two transpose convolutional layers and a third that looks a lot like the initial conv layer
self.deconv1 = deconv(conv_dim*4, conv_dim*2, 4)
self.deconv2 = deconv(conv_dim*2, conv_dim, 4)
# no batch norm on last layer
self.deconv3 = deconv(conv_dim, 3, 4, batch_norm=False)
def forward(self, x):
"""Given an image x, returns a transformed image."""
# define feedforward behavior, applying activations as necessary
out = F.relu(self.conv1(x))
out = F.relu(self.conv2(out))
out = F.relu(self.conv3(out))
out = self.res_blocks(out)
out = F.relu(self.deconv1(out))
out = F.relu(self.deconv2(out))
# tanh applied to last layer
out = torch.tanh(self.deconv3(out))
return out
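As with the discriminator, an optional sanity-check sketch: an untrained generator should map a 128x128 image to another 128x128 image, with values inside the tanh range (-1, 1).

# shape/range check for the generator
test_G = CycleGenerator(conv_dim=64, n_res_blocks=6)
dummy = torch.randn(2, 3, 128, 128)
out = test_G(dummy)
print(out.shape)  # expect torch.Size([2, 3, 128, 128])
print(out.min().item(), out.max().item())  # both strictly within (-1, 1)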
Using the classes you defined earlier, you can define the discriminators and generators necessary to create a complete CycleGAN. The given parameters should work for training.
First, create two discriminators, one for checking if $X$ sample images are real, and one for checking if $Y$ sample images are real. Then the generators: instantiate two of them, one for transforming images from domain $X$ into domain $Y$ and one for the reverse mapping, from $Y$ to $X$.
def create_model(g_conv_dim=64, d_conv_dim=64, n_res_blocks=6):
"""Builds the generators and discriminators."""
# Instantiate generators
G_XtoY = CycleGenerator(conv_dim=g_conv_dim, n_res_blocks=n_res_blocks)
G_YtoX = CycleGenerator(conv_dim=g_conv_dim, n_res_blocks=n_res_blocks)
# Instantiate discriminators
D_X = Discriminator(conv_dim=d_conv_dim)
D_Y = Discriminator(conv_dim=d_conv_dim)
# move models to GPU, if available
if torch.cuda.is_available():
device = torch.device("cuda:0")
G_XtoY.to(device)
G_YtoX.to(device)
D_X.to(device)
D_Y.to(device)
print('Models moved to GPU.')
else:
print('Only CPU available.')
return G_XtoY, G_YtoX, D_X, D_Y
# call the function to get models
G_XtoY, G_YtoX, D_X, D_Y = create_model()
Models moved to GPU.
The function create_model should return the two generator and two discriminator networks. After you've defined these discriminator and generator components, it's good practice to check your work. The easiest way to do this is to print out your model architecture and read through it to make sure the parameters are what you expected. The next cell will print out their architectures.
# helper function for printing the model architecture
def print_models(G_XtoY, G_YtoX, D_X, D_Y):
"""Prints model information for the generators and discriminators.
"""
print(" G_XtoY ")
print("-----------------------------------------------")
print(G_XtoY)
print()
print(" G_YtoX ")
print("-----------------------------------------------")
print(G_YtoX)
print()
print(" D_X ")
print("-----------------------------------------------")
print(D_X)
print()
print(" D_Y ")
print("-----------------------------------------------")
print(D_Y)
print()
# print all of the models
print_models(G_XtoY, G_YtoX, D_X, D_Y)
G_XtoY
-----------------------------------------------
CycleGenerator(
(conv1): Sequential(
(0): Conv2d(3, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv2): Sequential(
(0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv3): Sequential(
(0): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(res_blocks): Sequential(
(0): ResidualBlock(
(conv_layer1): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_layer2): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): ResidualBlock(
(conv_layer1): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_layer2): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(2): ResidualBlock(
(conv_layer1): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_layer2): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(3): ResidualBlock(
(conv_layer1): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_layer2): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(4): ResidualBlock(
(conv_layer1): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_layer2): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(5): ResidualBlock(
(conv_layer1): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_layer2): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
)
(deconv1): Sequential(
(0): ConvTranspose2d(256, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(deconv2): Sequential(
(0): ConvTranspose2d(128, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(deconv3): Sequential(
(0): ConvTranspose2d(64, 3, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
)
)
G_YtoX
-----------------------------------------------
CycleGenerator(
(conv1): Sequential(
(0): Conv2d(3, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv2): Sequential(
(0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv3): Sequential(
(0): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(res_blocks): Sequential(
(0): ResidualBlock(
(conv_layer1): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_layer2): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): ResidualBlock(
(conv_layer1): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_layer2): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(2): ResidualBlock(
(conv_layer1): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_layer2): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(3): ResidualBlock(
(conv_layer1): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_layer2): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(4): ResidualBlock(
(conv_layer1): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_layer2): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(5): ResidualBlock(
(conv_layer1): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_layer2): Sequential(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
)
(deconv1): Sequential(
(0): ConvTranspose2d(256, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(deconv2): Sequential(
(0): ConvTranspose2d(128, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(deconv3): Sequential(
(0): ConvTranspose2d(64, 3, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
)
)
D_X
-----------------------------------------------
Discriminator(
(conv1): Sequential(
(0): Conv2d(3, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
)
(conv2): Sequential(
(0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv3): Sequential(
(0): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv4): Sequential(
(0): Conv2d(256, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv5): Sequential(
(0): Conv2d(512, 1, kernel_size=(8, 8), stride=(1, 1), padding=(0, 0), bias=False)
)
)
D_Y
-----------------------------------------------
Discriminator(
(conv1): Sequential(
(0): Conv2d(3, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
)
(conv2): Sequential(
(0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv3): Sequential(
(0): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv4): Sequential(
(0): Conv2d(256, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv5): Sequential(
(0): Conv2d(512, 1, kernel_size=(8, 8), stride=(1, 1), padding=(0, 0), bias=False)
)
)
Computing the discriminator and the generator losses are key to getting a CycleGAN to train.

Image from original paper by Jun-Yan Zhu et. al.
The CycleGAN contains two mapping functions $G: X \rightarrow Y$ and $F: Y \rightarrow X$, and associated adversarial discriminators $D_Y$ and $D_X$. (a) $D_Y$ encourages $G$ to translate $X$ into outputs indistinguishable from domain $Y$, and vice versa for $D_X$ and $F$.
To further regularize the mappings, we introduce two cycle consistency losses that capture the intuition that if we translate from one domain to the other and back again we should arrive at where we started. (b) Forward cycle-consistency loss and (c) backward cycle-consistency loss.
We've seen that regular GANs treat the discriminator as a classifier with the sigmoid cross entropy loss function. However, this loss function may lead to the vanishing gradients problem during the learning process. To overcome such a problem, we'll use a least squares loss function for the discriminator. This structure is also referred to as a least squares GAN or LSGAN, and you can read the original paper on LSGANs, here. The authors show that LSGANs are able to generate higher quality images than regular GANs and that this loss type is a bit more stable during training!
The discriminator losses will be mean squared errors between the output of the discriminator, given an image, and the target value, 0 or 1, depending on whether it should classify that image as fake or real. For example, for a real image, x, we can train $D_X$ by looking at how close it is to recognizing an image x as real using the mean squared error:
out_x = D_X(x)
real_err = torch.mean((out_x-1)**2)
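The fake-image counterpart pushes the discriminator's output toward 0; under the same convention (fake_x below is just a placeholder name for a generated image):

out_x = D_X(fake_x)  # fake_x: a generated, fake image
fake_err = torch.mean(out_x**2)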
Calculating the generator losses will look somewhat similar to calculating the discriminator loss; there will still be steps in which you generate fake images that look like they belong to the set of $X$ images but are based on real images in set $Y$, and vice versa. You'll compute the "real loss" on those generated images by looking at the output of the discriminator as it's applied to these fake images; this time, your generator aims to make the discriminator classify these fake images as real images.
In addition to the adversarial losses, the generator loss terms will also include the cycle consistency loss. This loss is a measure of how good a reconstructed image is, when compared to an original image.
Say you have a fake, generated image, x_hat, and a real image, y. You can get a reconstructed y_hat by applying G_XtoY(x_hat) = y_hat and then checking to see if this reconstruction y_hat and the original image y match. For this, we recommend calculating the L1 loss, which is an absolute difference, between reconstructed and real images. You may also choose to multiply this loss by some weight value lambda_weight to convey its importance.

The total generator loss will be the sum of the generator losses and the forward and backward cycle consistency losses.
To help us calculate the discriminator and generator losses during training, let's define some helpful loss functions. Here, we'll define three.
- real_mse_loss that looks at the output of a discriminator and returns the error based on how close that output is to being classified as real. This should be a mean squared error.
- fake_mse_loss that looks at the output of a discriminator and returns the error based on how close that output is to being classified as fake. This should be a mean squared error.
- cycle_consistency_loss that looks at a set of real images and a set of reconstructed/generated images, and returns the mean absolute error between them. This has a lambda_weight parameter that will weight the mean absolute error in a batch.

It's recommended that you take a look at the original CycleGAN paper to get a starting value for lambda_weight.
def real_mse_loss(D_out):
# how close is the produced output from being "real"?
return torch.mean((D_out-1)**2)
def fake_mse_loss(D_out):
# how close is the produced output from being "false"?
return torch.mean(D_out**2)
def cycle_consistency_loss(real_im, reconstructed_im, lambda_weight):
# calculate reconstruction loss
# as absolute value difference between the real and reconstructed images
reconstr_loss = torch.mean(torch.abs(real_im - reconstructed_im))
# return weighted loss
return lambda_weight*reconstr_loss
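A quick, optional numeric check of these definitions: an all-ones discriminator output should give zero real loss, an all-zeros output zero fake loss, and two constant images that differ by 1 everywhere should give a cycle loss equal to lambda_weight.

# sanity check the three loss functions on constant tensors
ones = torch.ones(4, 1, 1, 1)
zeros = torch.zeros(4, 1, 1, 1)
print(real_mse_loss(ones).item())   # 0.0
print(fake_mse_loss(zeros).item())  # 0.0
print(cycle_consistency_loss(ones, zeros, lambda_weight=10).item())  # 10.0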
Next, let's define how this model will update its weights. Like the GANs you may have seen before, this uses Adam optimizers for the discriminators and generators. It's again recommended that you take a look at the original CycleGAN paper to get starting hyperparameter values.
import torch.optim as optim
# hyperparams for Adam optimizer
lr=0.0002
beta1=0.5
beta2=0.999 # default value
g_params = list(G_XtoY.parameters()) + list(G_YtoX.parameters()) # Get generator parameters
# Create optimizers for the generators and discriminators
g_optimizer = optim.Adam(g_params, lr, [beta1, beta2])
d_x_optimizer = optim.Adam(D_X.parameters(), lr, [beta1, beta2])
d_y_optimizer = optim.Adam(D_Y.parameters(), lr, [beta1, beta2])
When a CycleGAN trains, and sees one batch of real images from set $X$ and $Y$, it trains by performing the following steps:

Training the Discriminators
1. Compute the discriminator loss on real images
2. Generate fake images that look like they came from the other domain
3. Compute the fake loss for the discriminator
4. Add the real and fake losses, perform backpropagation, and take an optimizer step

Training the Generators
1. Generate fake images that look like one domain, based on real images from the other
2. Compute the generator loss, based on how the discriminator responds to the fake images
3. Generate reconstructed images by mapping the fake images back to their original domain
4. Compute the cycle consistency loss by comparing the reconstructions with their real originals
5. Add up all generator and reconstruction losses, perform backpropagation, and take an optimizer step

A CycleGAN repeats its training process, alternating between training the discriminators and the generators, for a specified number of training iterations. You've been given code that will save some example generated images that the CycleGAN has learned to generate after a certain number of training iterations. Along with looking at the losses, these example generations should give you an idea of how well your network has trained.
Below, you may choose to keep all default parameters; your only task is to calculate the appropriate losses and complete the training cycle.
# import save code
from helpers import save_samples, checkpoint
# train the network
def training_loop(dataloader_X, dataloader_Y, test_dataloader_X, test_dataloader_Y,
n_epochs=1000):
print_every=10
# keep track of losses over time
losses = []
test_iter_X = iter(test_dataloader_X)
test_iter_Y = iter(test_dataloader_Y)
# Get some fixed data from domains X and Y for sampling. These are images that are held
# constant throughout training, that allow us to inspect the model's performance.
fixed_X = next(test_iter_X)[0]
fixed_Y = next(test_iter_Y)[0]
fixed_X = scale(fixed_X) # make sure to scale to a range -1 to 1
fixed_Y = scale(fixed_Y)
# batches per epoch
iter_X = iter(dataloader_X)
iter_Y = iter(dataloader_Y)
batches_per_epoch = min(len(iter_X), len(iter_Y))
for epoch in range(1, n_epochs+1):
# Reset iterators for each epoch
if epoch % batches_per_epoch == 0:
iter_X = iter(dataloader_X)
iter_Y = iter(dataloader_Y)
images_X, _ = next(iter_X)
images_X = scale(images_X) # make sure to scale to a range -1 to 1
images_Y, _ = next(iter_Y)
images_Y = scale(images_Y)
# move images to GPU if available (otherwise stay on CPU)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
images_X = images_X.to(device)
images_Y = images_Y.to(device)
# ============================================
# TRAIN THE DISCRIMINATORS
# ============================================
## First: D_X, real and fake loss components ##
# Train with real images
d_x_optimizer.zero_grad()
# 1. Compute the discriminator losses on real images
out_x = D_X(images_X)
D_X_real_loss = real_mse_loss(out_x)
# Train with fake images
# 2. Generate fake images that look like domain X based on real images in domain Y
fake_X = G_YtoX(images_Y)
# 3. Compute the fake loss for D_X
out_x = D_X(fake_X)
D_X_fake_loss = fake_mse_loss(out_x)
# 4. Compute the total loss and perform backprop
d_x_loss = D_X_real_loss + D_X_fake_loss
d_x_loss.backward()
d_x_optimizer.step()
## Second: D_Y, real and fake loss components ##
# Train with real images
d_y_optimizer.zero_grad()
# 1. Compute the discriminator losses on real images
out_y = D_Y(images_Y)
D_Y_real_loss = real_mse_loss(out_y)
# Train with fake images
# 2. Generate fake images that look like domain Y based on real images in domain X
fake_Y = G_XtoY(images_X)
# 3. Compute the fake loss for D_Y
out_y = D_Y(fake_Y)
D_Y_fake_loss = fake_mse_loss(out_y)
# 4. Compute the total loss and perform backprop
d_y_loss = D_Y_real_loss + D_Y_fake_loss
d_y_loss.backward()
d_y_optimizer.step()
# =========================================
# TRAIN THE GENERATORS
# =========================================
## First: generate fake X images and reconstructed Y images ##
g_optimizer.zero_grad()
# 1. Generate fake images that look like domain X based on real images in domain Y
fake_X = G_YtoX(images_Y)
# 2. Compute the generator loss based on domain X
out_x = D_X(fake_X)
g_YtoX_loss = real_mse_loss(out_x)
# 3. Create a reconstructed y
# 4. Compute the cycle consistency loss (the reconstruction loss)
reconstructed_Y = G_XtoY(fake_X)
reconstructed_y_loss = cycle_consistency_loss(images_Y, reconstructed_Y, lambda_weight=10)
## Second: generate fake Y images and reconstructed X images ##
# 1. Generate fake images that look like domain Y based on real images in domain X
fake_Y = G_XtoY(images_X)
# 2. Compute the generator loss based on domain Y
out_y = D_Y(fake_Y)
g_XtoY_loss = real_mse_loss(out_y)
# 3. Create a reconstructed x
# 4. Compute the cycle consistency loss (the reconstruction loss)
reconstructed_X = G_YtoX(fake_Y)
reconstructed_x_loss = cycle_consistency_loss(images_X, reconstructed_X, lambda_weight=10)
# 5. Add up all generator and reconstructed losses and perform backprop
g_total_loss = g_YtoX_loss + g_XtoY_loss + reconstructed_y_loss + reconstructed_x_loss
g_total_loss.backward()
g_optimizer.step()
# Print the log info
if epoch % print_every == 0:
# append real and fake discriminator losses and the generator loss
losses.append((d_x_loss.item(), d_y_loss.item(), g_total_loss.item()))
print('Epoch [{:5d}/{:5d}] | d_X_loss: {:6.4f} | d_Y_loss: {:6.4f} | g_total_loss: {:6.4f}'.format(
epoch, n_epochs, d_x_loss.item(), d_y_loss.item(), g_total_loss.item()))
sample_every=100
# Save the generated samples
if epoch % sample_every == 0:
G_YtoX.eval() # set generators to eval mode for sample generation
G_XtoY.eval()
save_samples(epoch, fixed_Y, fixed_X, G_YtoX, G_XtoY, batch_size=16)
G_YtoX.train()
G_XtoY.train()
# uncomment these lines, if you want to save your model
# checkpoint_every=1000
# # Save the model parameters
# if epoch % checkpoint_every == 0:
# checkpoint(epoch, G_XtoY, G_YtoX, D_X, D_Y)
return losses
n_epochs = 4000 # keep this small when first testing whether the model works
losses = training_loop(dataloader_X, dataloader_Y, test_dataloader_X, test_dataloader_Y, n_epochs=n_epochs)
Epoch [ 10/ 4000] | d_X_loss: 0.3958 | d_Y_loss: 0.4528 | g_total_loss: 9.4696 Epoch [ 20/ 4000] | d_X_loss: 0.2927 | d_Y_loss: 0.4160 | g_total_loss: 8.6626 Epoch [ 30/ 4000] | d_X_loss: 0.1057 | d_Y_loss: 0.4825 | g_total_loss: 7.2664 Epoch [ 40/ 4000] | d_X_loss: 0.2113 | d_Y_loss: 0.3687 | g_total_loss: 7.1753 Epoch [ 50/ 4000] | d_X_loss: 0.3938 | d_Y_loss: 0.4327 | g_total_loss: 5.7858 Epoch [ 60/ 4000] | d_X_loss: 0.4369 | d_Y_loss: 0.3721 | g_total_loss: 6.0677 Epoch [ 70/ 4000] | d_X_loss: 0.5746 | d_Y_loss: 0.4035 | g_total_loss: 5.5985 Epoch [ 80/ 4000] | d_X_loss: 0.3245 | d_Y_loss: 0.6153 | g_total_loss: 5.2730 Epoch [ 90/ 4000] | d_X_loss: 0.3823 | d_Y_loss: 0.2637 | g_total_loss: 5.8418 Epoch [ 100/ 4000] | d_X_loss: 0.5089 | d_Y_loss: 0.3039 | g_total_loss: 5.3163 Saved samples_cyclegan/sample-000100-X-Y.png Saved samples_cyclegan/sample-000100-Y-X.png Epoch [ 110/ 4000] | d_X_loss: 0.4999 | d_Y_loss: 0.3664 | g_total_loss: 5.0403 Epoch [ 120/ 4000] | d_X_loss: 0.3447 | d_Y_loss: 0.3905 | g_total_loss: 4.9195 Epoch [ 130/ 4000] | d_X_loss: 0.3520 | d_Y_loss: 0.4624 | g_total_loss: 4.6650 Epoch [ 140/ 4000] | d_X_loss: 0.3706 | d_Y_loss: 0.4946 | g_total_loss: 4.4959 Epoch [ 150/ 4000] | d_X_loss: 0.4719 | d_Y_loss: 0.3111 | g_total_loss: 5.6157 Epoch [ 160/ 4000] | d_X_loss: 0.4138 | d_Y_loss: 0.3682 | g_total_loss: 4.8214 Epoch [ 170/ 4000] | d_X_loss: 0.4636 | d_Y_loss: 0.4448 | g_total_loss: 4.7001 Epoch [ 180/ 4000] | d_X_loss: 0.5273 | d_Y_loss: 0.4718 | g_total_loss: 4.8506 Epoch [ 190/ 4000] | d_X_loss: 0.3773 | d_Y_loss: 0.3439 | g_total_loss: 4.2842 Epoch [ 200/ 4000] | d_X_loss: 0.3530 | d_Y_loss: 0.3974 | g_total_loss: 4.2591 Saved samples_cyclegan/sample-000200-X-Y.png Saved samples_cyclegan/sample-000200-Y-X.png Epoch [ 210/ 4000] | d_X_loss: 0.2833 | d_Y_loss: 0.3571 | g_total_loss: 4.5734 Epoch [ 220/ 4000] | d_X_loss: 0.4130 | d_Y_loss: 0.4411 | g_total_loss: 4.6487 Epoch [ 230/ 4000] | d_X_loss: 0.3631 | d_Y_loss: 0.4969 | g_total_loss: 4.3084 Epoch [ 240/ 4000] | d_X_loss: 0.4470 | d_Y_loss: 0.4444 | g_total_loss: 4.1436 Epoch [ 250/ 4000] | d_X_loss: 0.4731 | d_Y_loss: 0.4438 | g_total_loss: 4.7368 Epoch [ 260/ 4000] | d_X_loss: 0.4606 | d_Y_loss: 0.4062 | g_total_loss: 4.4761 Epoch [ 270/ 4000] | d_X_loss: 0.3606 | d_Y_loss: 0.7543 | g_total_loss: 5.5468 Epoch [ 280/ 4000] | d_X_loss: 0.4668 | d_Y_loss: 0.4361 | g_total_loss: 4.2389 Epoch [ 290/ 4000] | d_X_loss: 0.4978 | d_Y_loss: 0.5004 | g_total_loss: 3.7817 Epoch [ 300/ 4000] | d_X_loss: 0.3795 | d_Y_loss: 0.3967 | g_total_loss: 3.9241 Saved samples_cyclegan/sample-000300-X-Y.png Saved samples_cyclegan/sample-000300-Y-X.png Epoch [ 310/ 4000] | d_X_loss: 0.4100 | d_Y_loss: 0.3647 | g_total_loss: 4.1676 Epoch [ 320/ 4000] | d_X_loss: 0.4326 | d_Y_loss: 0.3502 | g_total_loss: 3.9641 Epoch [ 330/ 4000] | d_X_loss: 0.4853 | d_Y_loss: 0.3476 | g_total_loss: 4.0044 Epoch [ 340/ 4000] | d_X_loss: 0.4375 | d_Y_loss: 0.4410 | g_total_loss: 4.1737 Epoch [ 350/ 4000] | d_X_loss: 0.4275 | d_Y_loss: 0.3342 | g_total_loss: 4.0098 Epoch [ 360/ 4000] | d_X_loss: 0.4653 | d_Y_loss: 0.5696 | g_total_loss: 4.0303 Epoch [ 370/ 4000] | d_X_loss: 0.3398 | d_Y_loss: 0.4646 | g_total_loss: 4.2594 Epoch [ 380/ 4000] | d_X_loss: 0.3682 | d_Y_loss: 0.4882 | g_total_loss: 3.9387 Epoch [ 390/ 4000] | d_X_loss: 0.2931 | d_Y_loss: 0.4569 | g_total_loss: 4.1545 Epoch [ 400/ 4000] | d_X_loss: 0.4850 | d_Y_loss: 0.6202 | g_total_loss: 4.5546 Saved samples_cyclegan/sample-000400-X-Y.png Saved samples_cyclegan/sample-000400-Y-X.png 
Epoch [ 410/ 4000] | d_X_loss: 0.5042 | d_Y_loss: 0.3679 | g_total_loss: 4.9290 Epoch [ 420/ 4000] | d_X_loss: 0.4301 | d_Y_loss: 0.3999 | g_total_loss: 3.6982 Epoch [ 430/ 4000] | d_X_loss: 0.4338 | d_Y_loss: 0.4282 | g_total_loss: 4.5794 Epoch [ 440/ 4000] | d_X_loss: 0.4072 | d_Y_loss: 0.4188 | g_total_loss: 3.8231 Epoch [ 450/ 4000] | d_X_loss: 0.3697 | d_Y_loss: 0.2884 | g_total_loss: 4.1580 Epoch [ 460/ 4000] | d_X_loss: 0.3991 | d_Y_loss: 0.4370 | g_total_loss: 3.8552 Epoch [ 470/ 4000] | d_X_loss: 0.4677 | d_Y_loss: 0.5313 | g_total_loss: 3.5911 Epoch [ 480/ 4000] | d_X_loss: 0.4175 | d_Y_loss: 0.5062 | g_total_loss: 4.3155 Epoch [ 490/ 4000] | d_X_loss: 0.3512 | d_Y_loss: 0.4501 | g_total_loss: 4.4781 Epoch [ 500/ 4000] | d_X_loss: 0.4325 | d_Y_loss: 0.4325 | g_total_loss: 4.0386 Saved samples_cyclegan/sample-000500-X-Y.png Saved samples_cyclegan/sample-000500-Y-X.png Epoch [ 510/ 4000] | d_X_loss: 0.4346 | d_Y_loss: 0.3826 | g_total_loss: 3.7624 Epoch [ 520/ 4000] | d_X_loss: 0.4285 | d_Y_loss: 0.3654 | g_total_loss: 4.0454 Epoch [ 530/ 4000] | d_X_loss: 0.4663 | d_Y_loss: 0.3757 | g_total_loss: 3.8003 Epoch [ 540/ 4000] | d_X_loss: 0.4976 | d_Y_loss: 0.4451 | g_total_loss: 4.0646 Epoch [ 550/ 4000] | d_X_loss: 0.4825 | d_Y_loss: 0.3442 | g_total_loss: 3.9697 Epoch [ 560/ 4000] | d_X_loss: 0.3113 | d_Y_loss: 0.4040 | g_total_loss: 4.6918 Epoch [ 570/ 4000] | d_X_loss: 0.3384 | d_Y_loss: 0.3550 | g_total_loss: 4.1619 Epoch [ 580/ 4000] | d_X_loss: 0.4026 | d_Y_loss: 0.4575 | g_total_loss: 4.2190 Epoch [ 590/ 4000] | d_X_loss: 0.5437 | d_Y_loss: 0.4402 | g_total_loss: 4.9007 Epoch [ 600/ 4000] | d_X_loss: 0.3645 | d_Y_loss: 0.3961 | g_total_loss: 3.6801 Saved samples_cyclegan/sample-000600-X-Y.png Saved samples_cyclegan/sample-000600-Y-X.png Epoch [ 610/ 4000] | d_X_loss: 0.3899 | d_Y_loss: 0.3716 | g_total_loss: 4.6374 Epoch [ 620/ 4000] | d_X_loss: 0.5180 | d_Y_loss: 0.4266 | g_total_loss: 4.0794 Epoch [ 630/ 4000] | d_X_loss: 0.4673 | d_Y_loss: 0.6168 | g_total_loss: 3.8071 Epoch [ 640/ 4000] | d_X_loss: 0.4547 | d_Y_loss: 0.3096 | g_total_loss: 4.5614 Epoch [ 650/ 4000] | d_X_loss: 0.5129 | d_Y_loss: 0.3784 | g_total_loss: 4.0313 Epoch [ 660/ 4000] | d_X_loss: 0.4850 | d_Y_loss: 0.3121 | g_total_loss: 3.7061 Epoch [ 670/ 4000] | d_X_loss: 0.3178 | d_Y_loss: 0.2921 | g_total_loss: 4.1195 Epoch [ 680/ 4000] | d_X_loss: 0.4066 | d_Y_loss: 0.5008 | g_total_loss: 3.9960 Epoch [ 690/ 4000] | d_X_loss: 0.4194 | d_Y_loss: 0.4699 | g_total_loss: 3.9686 Epoch [ 700/ 4000] | d_X_loss: 0.3751 | d_Y_loss: 0.4119 | g_total_loss: 4.0735 Saved samples_cyclegan/sample-000700-X-Y.png Saved samples_cyclegan/sample-000700-Y-X.png Epoch [ 710/ 4000] | d_X_loss: 0.5532 | d_Y_loss: 0.3772 | g_total_loss: 4.1204 Epoch [ 720/ 4000] | d_X_loss: 0.4983 | d_Y_loss: 0.4649 | g_total_loss: 3.8425 Epoch [ 730/ 4000] | d_X_loss: 0.4583 | d_Y_loss: 0.3580 | g_total_loss: 3.9373 Epoch [ 740/ 4000] | d_X_loss: 0.5571 | d_Y_loss: 0.5231 | g_total_loss: 4.1644 Epoch [ 750/ 4000] | d_X_loss: 0.3470 | d_Y_loss: 0.3779 | g_total_loss: 4.0781 Epoch [ 760/ 4000] | d_X_loss: 0.4300 | d_Y_loss: 0.4396 | g_total_loss: 3.7001 Epoch [ 770/ 4000] | d_X_loss: 0.4084 | d_Y_loss: 0.2979 | g_total_loss: 4.0096 Epoch [ 780/ 4000] | d_X_loss: 0.4412 | d_Y_loss: 0.4371 | g_total_loss: 4.0965 Epoch [ 790/ 4000] | d_X_loss: 0.3288 | d_Y_loss: 0.4292 | g_total_loss: 4.0511 Epoch [ 800/ 4000] | d_X_loss: 0.3966 | d_Y_loss: 0.4337 | g_total_loss: 3.4207 Saved samples_cyclegan/sample-000800-X-Y.png Saved 
samples_cyclegan/sample-000800-Y-X.png Epoch [ 810/ 4000] | d_X_loss: 0.3714 | d_Y_loss: 0.5531 | g_total_loss: 3.3714 Epoch [ 820/ 4000] | d_X_loss: 0.4521 | d_Y_loss: 0.4413 | g_total_loss: 3.9536 Epoch [ 830/ 4000] | d_X_loss: 0.4333 | d_Y_loss: 0.4751 | g_total_loss: 3.5833 Epoch [ 840/ 4000] | d_X_loss: 0.3427 | d_Y_loss: 0.3983 | g_total_loss: 4.4171 Epoch [ 850/ 4000] | d_X_loss: 0.4748 | d_Y_loss: 0.3613 | g_total_loss: 4.2533 Epoch [ 860/ 4000] | d_X_loss: 0.4127 | d_Y_loss: 0.3467 | g_total_loss: 3.7732 Epoch [ 870/ 4000] | d_X_loss: 0.4778 | d_Y_loss: 0.3335 | g_total_loss: 3.9159 Epoch [ 880/ 4000] | d_X_loss: 0.3983 | d_Y_loss: 0.5284 | g_total_loss: 3.9287 Epoch [ 890/ 4000] | d_X_loss: 0.4359 | d_Y_loss: 0.4131 | g_total_loss: 3.6775 Epoch [ 900/ 4000] | d_X_loss: 0.4517 | d_Y_loss: 0.3534 | g_total_loss: 4.0543 Saved samples_cyclegan/sample-000900-X-Y.png Saved samples_cyclegan/sample-000900-Y-X.png Epoch [ 910/ 4000] | d_X_loss: 0.4132 | d_Y_loss: 0.3711 | g_total_loss: 3.5335 Epoch [ 920/ 4000] | d_X_loss: 0.4371 | d_Y_loss: 0.7068 | g_total_loss: 3.7064 Epoch [ 930/ 4000] | d_X_loss: 0.4421 | d_Y_loss: 0.4297 | g_total_loss: 3.7903 Epoch [ 940/ 4000] | d_X_loss: 0.5256 | d_Y_loss: 0.3409 | g_total_loss: 4.2808 Epoch [ 950/ 4000] | d_X_loss: 0.4924 | d_Y_loss: 0.5120 | g_total_loss: 4.0589 Epoch [ 960/ 4000] | d_X_loss: 0.4503 | d_Y_loss: 0.3685 | g_total_loss: 3.5139 Epoch [ 970/ 4000] | d_X_loss: 0.4898 | d_Y_loss: 0.4274 | g_total_loss: 3.8152 Epoch [ 980/ 4000] | d_X_loss: 0.3585 | d_Y_loss: 0.4812 | g_total_loss: 3.3480 Epoch [ 990/ 4000] | d_X_loss: 0.3314 | d_Y_loss: 0.2865 | g_total_loss: 4.0738 Epoch [ 1000/ 4000] | d_X_loss: 0.4100 | d_Y_loss: 0.4067 | g_total_loss: 3.6095 Saved samples_cyclegan/sample-001000-X-Y.png Saved samples_cyclegan/sample-001000-Y-X.png Epoch [ 1010/ 4000] | d_X_loss: 0.5186 | d_Y_loss: 0.3508 | g_total_loss: 4.1594 Epoch [ 1020/ 4000] | d_X_loss: 0.4703 | d_Y_loss: 0.4720 | g_total_loss: 4.1957 Epoch [ 1030/ 4000] | d_X_loss: 0.3464 | d_Y_loss: 0.3117 | g_total_loss: 3.6171 Epoch [ 1040/ 4000] | d_X_loss: 0.4272 | d_Y_loss: 0.2369 | g_total_loss: 3.7737 Epoch [ 1050/ 4000] | d_X_loss: 0.5727 | d_Y_loss: 0.3462 | g_total_loss: 4.3117 Epoch [ 1060/ 4000] | d_X_loss: 0.4288 | d_Y_loss: 0.6680 | g_total_loss: 3.3234 Epoch [ 1070/ 4000] | d_X_loss: 0.3597 | d_Y_loss: 0.3552 | g_total_loss: 3.8147 Epoch [ 1080/ 4000] | d_X_loss: 0.3398 | d_Y_loss: 0.3865 | g_total_loss: 4.4149 Epoch [ 1090/ 4000] | d_X_loss: 0.4148 | d_Y_loss: 0.2571 | g_total_loss: 4.1151 Epoch [ 1100/ 4000] | d_X_loss: 0.3463 | d_Y_loss: 0.3506 | g_total_loss: 3.5968 Saved samples_cyclegan/sample-001100-X-Y.png Saved samples_cyclegan/sample-001100-Y-X.png Epoch [ 1110/ 4000] | d_X_loss: 0.4116 | d_Y_loss: 0.3019 | g_total_loss: 3.7438 Epoch [ 1120/ 4000] | d_X_loss: 0.3868 | d_Y_loss: 0.3859 | g_total_loss: 4.0059 Epoch [ 1130/ 4000] | d_X_loss: 0.3767 | d_Y_loss: 0.3889 | g_total_loss: 3.5743 Epoch [ 1140/ 4000] | d_X_loss: 0.5767 | d_Y_loss: 0.3833 | g_total_loss: 3.7889 Epoch [ 1150/ 4000] | d_X_loss: 0.4731 | d_Y_loss: 0.3772 | g_total_loss: 3.5062 Epoch [ 1160/ 4000] | d_X_loss: 0.4634 | d_Y_loss: 0.3134 | g_total_loss: 4.1293 Epoch [ 1170/ 4000] | d_X_loss: 0.4096 | d_Y_loss: 0.3610 | g_total_loss: 3.7504 Epoch [ 1180/ 4000] | d_X_loss: 0.3584 | d_Y_loss: 0.3317 | g_total_loss: 3.5340 Epoch [ 1190/ 4000] | d_X_loss: 0.3846 | d_Y_loss: 0.3783 | g_total_loss: 4.0327 Epoch [ 1200/ 4000] | d_X_loss: 0.4083 | d_Y_loss: 0.3054 | g_total_loss: 4.2000 Saved 
samples_cyclegan/sample-001200-X-Y.png
Saved samples_cyclegan/sample-001200-Y-X.png
Epoch [ 1210/ 4000] | d_X_loss: 0.4354 | d_Y_loss: 0.3495 | g_total_loss: 3.5759
Epoch [ 1220/ 4000] | d_X_loss: 0.3748 | d_Y_loss: 0.3762 | g_total_loss: 3.9294
Epoch [ 1230/ 4000] | d_X_loss: 0.3367 | d_Y_loss: 0.4489 | g_total_loss: 4.1623
Epoch [ 1240/ 4000] | d_X_loss: 0.3135 | d_Y_loss: 0.3957 | g_total_loss: 3.2557
Epoch [ 1250/ 4000] | d_X_loss: 0.4706 | d_Y_loss: 0.2646 | g_total_loss: 3.4710
Epoch [ 1260/ 4000] | d_X_loss: 0.3830 | d_Y_loss: 0.3323 | g_total_loss: 3.7209
Epoch [ 1270/ 4000] | d_X_loss: 0.3658 | d_Y_loss: 0.3719 | g_total_loss: 4.8712
Epoch [ 1280/ 4000] | d_X_loss: 0.2974 | d_Y_loss: 0.4189 | g_total_loss: 4.8283
Epoch [ 1290/ 4000] | d_X_loss: 0.4022 | d_Y_loss: 0.2567 | g_total_loss: 4.2263
Epoch [ 1300/ 4000] | d_X_loss: 0.4013 | d_Y_loss: 0.3799 | g_total_loss: 3.7673
Saved samples_cyclegan/sample-001300-X-Y.png
Saved samples_cyclegan/sample-001300-Y-X.png
Epoch [ 1310/ 4000] | d_X_loss: 0.4453 | d_Y_loss: 0.3027 | g_total_loss: 3.9327
Epoch [ 1320/ 4000] | d_X_loss: 0.4092 | d_Y_loss: 0.3367 | g_total_loss: 3.5698
Epoch [ 1330/ 4000] | d_X_loss: 0.4051 | d_Y_loss: 0.3373 | g_total_loss: 3.9174
Epoch [ 1340/ 4000] | d_X_loss: 0.4716 | d_Y_loss: 0.3057 | g_total_loss: 3.8844
Epoch [ 1350/ 4000] | d_X_loss: 0.3038 | d_Y_loss: 0.3413 | g_total_loss: 3.8874
Epoch [ 1360/ 4000] | d_X_loss: 0.4780 | d_Y_loss: 0.2594 | g_total_loss: 3.6004
Epoch [ 1370/ 4000] | d_X_loss: 0.4047 | d_Y_loss: 0.3518 | g_total_loss: 4.6455
Epoch [ 1380/ 4000] | d_X_loss: 0.4479 | d_Y_loss: 0.4917 | g_total_loss: 4.0354
Epoch [ 1390/ 4000] | d_X_loss: 0.3422 | d_Y_loss: 0.2367 | g_total_loss: 3.7476
Epoch [ 1400/ 4000] | d_X_loss: 0.3821 | d_Y_loss: 0.4471 | g_total_loss: 3.9943
Saved samples_cyclegan/sample-001400-X-Y.png
Saved samples_cyclegan/sample-001400-Y-X.png
Epoch [ 1410/ 4000] | d_X_loss: 0.2802 | d_Y_loss: 0.3174 | g_total_loss: 3.3933
Epoch [ 1420/ 4000] | d_X_loss: 0.3492 | d_Y_loss: 0.2663 | g_total_loss: 4.3483
Epoch [ 1430/ 4000] | d_X_loss: 0.5070 | d_Y_loss: 0.3689 | g_total_loss: 3.6532
Epoch [ 1440/ 4000] | d_X_loss: 0.3318 | d_Y_loss: 0.2989 | g_total_loss: 3.9910
Epoch [ 1450/ 4000] | d_X_loss: 0.3151 | d_Y_loss: 0.2454 | g_total_loss: 3.9948
Epoch [ 1460/ 4000] | d_X_loss: 0.2559 | d_Y_loss: 0.4991 | g_total_loss: 3.3297
Epoch [ 1470/ 4000] | d_X_loss: 0.3627 | d_Y_loss: 0.3085 | g_total_loss: 3.6701
Epoch [ 1480/ 4000] | d_X_loss: 0.4403 | d_Y_loss: 0.2893 | g_total_loss: 3.0209
Epoch [ 1490/ 4000] | d_X_loss: 0.3969 | d_Y_loss: 0.3619 | g_total_loss: 3.5297
Epoch [ 1500/ 4000] | d_X_loss: 0.3534 | d_Y_loss: 0.4633 | g_total_loss: 4.6977
Saved samples_cyclegan/sample-001500-X-Y.png
Saved samples_cyclegan/sample-001500-Y-X.png
Epoch [ 1510/ 4000] | d_X_loss: 0.4529 | d_Y_loss: 0.2683 | g_total_loss: 3.5203
Epoch [ 1520/ 4000] | d_X_loss: 0.3351 | d_Y_loss: 0.3328 | g_total_loss: 3.7003
Epoch [ 1530/ 4000] | d_X_loss: 0.4044 | d_Y_loss: 0.2457 | g_total_loss: 3.5270
Epoch [ 1540/ 4000] | d_X_loss: 0.3239 | d_Y_loss: 0.2747 | g_total_loss: 3.5087
Epoch [ 1550/ 4000] | d_X_loss: 0.3362 | d_Y_loss: 0.3768 | g_total_loss: 4.9243
Epoch [ 1560/ 4000] | d_X_loss: 0.2995 | d_Y_loss: 0.4274 | g_total_loss: 4.3883
Epoch [ 1570/ 4000] | d_X_loss: 0.3536 | d_Y_loss: 0.2188 | g_total_loss: 4.6758
Epoch [ 1580/ 4000] | d_X_loss: 0.4287 | d_Y_loss: 0.3455 | g_total_loss: 4.5181
Epoch [ 1590/ 4000] | d_X_loss: 0.3647 | d_Y_loss: 0.1996 | g_total_loss: 3.8649
Epoch [ 1600/ 4000] | d_X_loss: 0.3920 | d_Y_loss: 0.1638 | g_total_loss: 3.9747
Saved samples_cyclegan/sample-001600-X-Y.png
Saved samples_cyclegan/sample-001600-Y-X.png
Epoch [ 1610/ 4000] | d_X_loss: 0.3506 | d_Y_loss: 0.1973 | g_total_loss: 4.1645
Epoch [ 1620/ 4000] | d_X_loss: 0.2594 | d_Y_loss: 0.2510 | g_total_loss: 4.3657
Epoch [ 1630/ 4000] | d_X_loss: 0.3317 | d_Y_loss: 0.2214 | g_total_loss: 3.9909
Epoch [ 1640/ 4000] | d_X_loss: 0.2968 | d_Y_loss: 0.2595 | g_total_loss: 4.1298
Epoch [ 1650/ 4000] | d_X_loss: 0.3198 | d_Y_loss: 0.1320 | g_total_loss: 5.0270
Epoch [ 1660/ 4000] | d_X_loss: 0.3024 | d_Y_loss: 0.2426 | g_total_loss: 4.0094
Epoch [ 1670/ 4000] | d_X_loss: 0.2632 | d_Y_loss: 0.2956 | g_total_loss: 3.9935
Epoch [ 1680/ 4000] | d_X_loss: 0.3003 | d_Y_loss: 0.2843 | g_total_loss: 4.2955
Epoch [ 1690/ 4000] | d_X_loss: 0.3582 | d_Y_loss: 0.3504 | g_total_loss: 4.1516
Epoch [ 1700/ 4000] | d_X_loss: 0.1940 | d_Y_loss: 0.2469 | g_total_loss: 4.2701
Saved samples_cyclegan/sample-001700-X-Y.png
Saved samples_cyclegan/sample-001700-Y-X.png
Epoch [ 1710/ 4000] | d_X_loss: 0.2568 | d_Y_loss: 0.2306 | g_total_loss: 4.6020
Epoch [ 1720/ 4000] | d_X_loss: 0.3588 | d_Y_loss: 0.2291 | g_total_loss: 4.3582
Epoch [ 1730/ 4000] | d_X_loss: 0.2746 | d_Y_loss: 0.1809 | g_total_loss: 4.9615
Epoch [ 1740/ 4000] | d_X_loss: 0.3345 | d_Y_loss: 0.2796 | g_total_loss: 3.8268
Epoch [ 1750/ 4000] | d_X_loss: 0.2173 | d_Y_loss: 0.2468 | g_total_loss: 4.6300
Epoch [ 1760/ 4000] | d_X_loss: 0.3532 | d_Y_loss: 0.4676 | g_total_loss: 4.0815
Epoch [ 1770/ 4000] | d_X_loss: 0.4857 | d_Y_loss: 0.2454 | g_total_loss: 6.0277
Epoch [ 1780/ 4000] | d_X_loss: 0.3690 | d_Y_loss: 0.1703 | g_total_loss: 3.8203
Epoch [ 1790/ 4000] | d_X_loss: 0.2072 | d_Y_loss: 0.2146 | g_total_loss: 4.6067
Epoch [ 1800/ 4000] | d_X_loss: 0.3226 | d_Y_loss: 0.3547 | g_total_loss: 3.6608
Saved samples_cyclegan/sample-001800-X-Y.png
Saved samples_cyclegan/sample-001800-Y-X.png
Epoch [ 1810/ 4000] | d_X_loss: 0.3613 | d_Y_loss: 0.2225 | g_total_loss: 4.8653
Epoch [ 1820/ 4000] | d_X_loss: 0.2916 | d_Y_loss: 0.1634 | g_total_loss: 4.5669
Epoch [ 1830/ 4000] | d_X_loss: 0.4177 | d_Y_loss: 0.2754 | g_total_loss: 4.7970
Epoch [ 1840/ 4000] | d_X_loss: 0.2460 | d_Y_loss: 0.2808 | g_total_loss: 3.4729
Epoch [ 1850/ 4000] | d_X_loss: 0.4846 | d_Y_loss: 0.2273 | g_total_loss: 3.6149
Epoch [ 1860/ 4000] | d_X_loss: 0.3198 | d_Y_loss: 0.1878 | g_total_loss: 4.1438
Epoch [ 1870/ 4000] | d_X_loss: 0.2344 | d_Y_loss: 0.1614 | g_total_loss: 4.0786
Epoch [ 1880/ 4000] | d_X_loss: 0.3318 | d_Y_loss: 0.1791 | g_total_loss: 3.5380
Epoch [ 1890/ 4000] | d_X_loss: 0.2643 | d_Y_loss: 0.2098 | g_total_loss: 5.2291
Epoch [ 1900/ 4000] | d_X_loss: 0.2742 | d_Y_loss: 0.1931 | g_total_loss: 3.3093
Saved samples_cyclegan/sample-001900-X-Y.png
Saved samples_cyclegan/sample-001900-Y-X.png
Epoch [ 1910/ 4000] | d_X_loss: 0.3255 | d_Y_loss: 0.2252 | g_total_loss: 4.1855
Epoch [ 1920/ 4000] | d_X_loss: 0.2974 | d_Y_loss: 0.1735 | g_total_loss: 3.6111
Epoch [ 1930/ 4000] | d_X_loss: 0.3180 | d_Y_loss: 0.2527 | g_total_loss: 3.7145
Epoch [ 1940/ 4000] | d_X_loss: 0.2691 | d_Y_loss: 0.2199 | g_total_loss: 4.2059
Epoch [ 1950/ 4000] | d_X_loss: 0.3740 | d_Y_loss: 0.3629 | g_total_loss: 3.0767
Epoch [ 1960/ 4000] | d_X_loss: 0.3498 | d_Y_loss: 0.2249 | g_total_loss: 4.7678
Epoch [ 1970/ 4000] | d_X_loss: 0.1883 | d_Y_loss: 0.1720 | g_total_loss: 4.1730
Epoch [ 1980/ 4000] | d_X_loss: 0.3587 | d_Y_loss: 0.1563 | g_total_loss: 4.9254
Epoch [ 1990/ 4000] | d_X_loss: 0.3028 | d_Y_loss: 0.2737 | g_total_loss: 3.6923
Epoch [ 2000/ 4000] | d_X_loss: 0.2326 | d_Y_loss: 0.2510 | g_total_loss: 4.0155
Saved samples_cyclegan/sample-002000-X-Y.png
Saved samples_cyclegan/sample-002000-Y-X.png
Epoch [ 2010/ 4000] | d_X_loss: 0.3032 | d_Y_loss: 0.2023 | g_total_loss: 4.4918
Epoch [ 2020/ 4000] | d_X_loss: 0.1963 | d_Y_loss: 0.2506 | g_total_loss: 4.5897
Epoch [ 2030/ 4000] | d_X_loss: 0.2473 | d_Y_loss: 0.2489 | g_total_loss: 4.0731
Epoch [ 2040/ 4000] | d_X_loss: 0.2041 | d_Y_loss: 0.1640 | g_total_loss: 4.1864
Epoch [ 2050/ 4000] | d_X_loss: 0.2439 | d_Y_loss: 0.2661 | g_total_loss: 3.7701
Epoch [ 2060/ 4000] | d_X_loss: 0.2194 | d_Y_loss: 0.1643 | g_total_loss: 3.6900
Epoch [ 2070/ 4000] | d_X_loss: 0.2717 | d_Y_loss: 0.2902 | g_total_loss: 4.0376
Epoch [ 2080/ 4000] | d_X_loss: 0.2790 | d_Y_loss: 0.1770 | g_total_loss: 4.6142
Epoch [ 2090/ 4000] | d_X_loss: 0.2530 | d_Y_loss: 0.2468 | g_total_loss: 3.6647
Epoch [ 2100/ 4000] | d_X_loss: 0.3336 | d_Y_loss: 0.2750 | g_total_loss: 4.5134
Saved samples_cyclegan/sample-002100-X-Y.png
Saved samples_cyclegan/sample-002100-Y-X.png
Epoch [ 2110/ 4000] | d_X_loss: 0.2688 | d_Y_loss: 0.1741 | g_total_loss: 4.2551
Epoch [ 2120/ 4000] | d_X_loss: 0.2168 | d_Y_loss: 0.1680 | g_total_loss: 5.2604
Epoch [ 2130/ 4000] | d_X_loss: 0.2588 | d_Y_loss: 0.3086 | g_total_loss: 5.2141
Epoch [ 2140/ 4000] | d_X_loss: 1.1018 | d_Y_loss: 0.2042 | g_total_loss: 2.9630
Epoch [ 2150/ 4000] | d_X_loss: 0.2452 | d_Y_loss: 0.2129 | g_total_loss: 4.4640
Epoch [ 2160/ 4000] | d_X_loss: 0.2799 | d_Y_loss: 0.2164 | g_total_loss: 4.0459
Epoch [ 2170/ 4000] | d_X_loss: 0.2426 | d_Y_loss: 0.1698 | g_total_loss: 3.7974
Epoch [ 2180/ 4000] | d_X_loss: 0.3650 | d_Y_loss: 0.1941 | g_total_loss: 3.4241
Epoch [ 2190/ 4000] | d_X_loss: 0.2269 | d_Y_loss: 0.2949 | g_total_loss: 4.9501
Epoch [ 2200/ 4000] | d_X_loss: 0.2032 | d_Y_loss: 0.2354 | g_total_loss: 3.5302
Saved samples_cyclegan/sample-002200-X-Y.png
Saved samples_cyclegan/sample-002200-Y-X.png
Epoch [ 2210/ 4000] | d_X_loss: 0.2511 | d_Y_loss: 0.1932 | g_total_loss: 4.3024
Epoch [ 2220/ 4000] | d_X_loss: 0.2372 | d_Y_loss: 0.1912 | g_total_loss: 3.9535
Epoch [ 2230/ 4000] | d_X_loss: 0.2096 | d_Y_loss: 0.2804 | g_total_loss: 4.5337
Epoch [ 2240/ 4000] | d_X_loss: 0.2868 | d_Y_loss: 0.1142 | g_total_loss: 4.2860
Epoch [ 2250/ 4000] | d_X_loss: 0.2170 | d_Y_loss: 0.1721 | g_total_loss: 3.9165
Epoch [ 2260/ 4000] | d_X_loss: 0.2236 | d_Y_loss: 0.1512 | g_total_loss: 4.4272
Epoch [ 2270/ 4000] | d_X_loss: 0.2206 | d_Y_loss: 0.1238 | g_total_loss: 3.7345
Epoch [ 2280/ 4000] | d_X_loss: 0.2072 | d_Y_loss: 0.2947 | g_total_loss: 2.9466
Epoch [ 2290/ 4000] | d_X_loss: 0.3428 | d_Y_loss: 0.1982 | g_total_loss: 3.8461
Epoch [ 2300/ 4000] | d_X_loss: 0.2436 | d_Y_loss: 0.1498 | g_total_loss: 3.9561
Saved samples_cyclegan/sample-002300-X-Y.png
Saved samples_cyclegan/sample-002300-Y-X.png
Epoch [ 2310/ 4000] | d_X_loss: 0.2902 | d_Y_loss: 0.2342 | g_total_loss: 3.8383
Epoch [ 2320/ 4000] | d_X_loss: 0.2796 | d_Y_loss: 0.2179 | g_total_loss: 4.4457
Epoch [ 2330/ 4000] | d_X_loss: 0.2091 | d_Y_loss: 0.3622 | g_total_loss: 3.3021
Epoch [ 2340/ 4000] | d_X_loss: 0.2410 | d_Y_loss: 0.2246 | g_total_loss: 4.2316
Epoch [ 2350/ 4000] | d_X_loss: 0.3125 | d_Y_loss: 0.1924 | g_total_loss: 4.1430
Epoch [ 2360/ 4000] | d_X_loss: 0.1987 | d_Y_loss: 0.1597 | g_total_loss: 4.5396
Epoch [ 2370/ 4000] | d_X_loss: 0.2255 | d_Y_loss: 0.1740 | g_total_loss: 3.7733
Epoch [ 2380/ 4000] | d_X_loss: 0.3904 | d_Y_loss: 0.1687 | g_total_loss: 3.1419
Epoch [ 2390/ 4000] | d_X_loss: 0.1815 | d_Y_loss: 0.2071 | g_total_loss: 4.7013
Epoch [ 2400/ 4000] | d_X_loss: 0.2674 | d_Y_loss: 0.2049 | g_total_loss: 4.4468
Saved samples_cyclegan/sample-002400-X-Y.png
Saved samples_cyclegan/sample-002400-Y-X.png
Epoch [ 2410/ 4000] | d_X_loss: 0.3390 | d_Y_loss: 0.2999 | g_total_loss: 4.7052
Epoch [ 2420/ 4000] | d_X_loss: 0.3919 | d_Y_loss: 0.1226 | g_total_loss: 4.1392
Epoch [ 2430/ 4000] | d_X_loss: 0.2567 | d_Y_loss: 0.1687 | g_total_loss: 4.0582
Epoch [ 2440/ 4000] | d_X_loss: 0.2470 | d_Y_loss: 0.5112 | g_total_loss: 5.8621
Epoch [ 2450/ 4000] | d_X_loss: 0.2489 | d_Y_loss: 0.2190 | g_total_loss: 3.9979
Epoch [ 2460/ 4000] | d_X_loss: 0.2051 | d_Y_loss: 0.1511 | g_total_loss: 3.8372
Epoch [ 2470/ 4000] | d_X_loss: 0.2389 | d_Y_loss: 0.1378 | g_total_loss: 3.8707
Epoch [ 2480/ 4000] | d_X_loss: 0.2338 | d_Y_loss: 0.1582 | g_total_loss: 3.7847
Epoch [ 2490/ 4000] | d_X_loss: 0.3016 | d_Y_loss: 0.1276 | g_total_loss: 4.8564
Epoch [ 2500/ 4000] | d_X_loss: 0.2886 | d_Y_loss: 0.2735 | g_total_loss: 4.7964
Saved samples_cyclegan/sample-002500-X-Y.png
Saved samples_cyclegan/sample-002500-Y-X.png
Epoch [ 2510/ 4000] | d_X_loss: 0.2541 | d_Y_loss: 0.1899 | g_total_loss: 3.5550
Epoch [ 2520/ 4000] | d_X_loss: 0.2279 | d_Y_loss: 0.1843 | g_total_loss: 4.4447
Epoch [ 2530/ 4000] | d_X_loss: 0.3426 | d_Y_loss: 0.1744 | g_total_loss: 3.8378
Epoch [ 2540/ 4000] | d_X_loss: 0.2202 | d_Y_loss: 0.1715 | g_total_loss: 3.2370
Epoch [ 2550/ 4000] | d_X_loss: 0.1619 | d_Y_loss: 0.1729 | g_total_loss: 4.5417
Epoch [ 2560/ 4000] | d_X_loss: 0.2503 | d_Y_loss: 0.4097 | g_total_loss: 3.4391
Epoch [ 2570/ 4000] | d_X_loss: 0.1655 | d_Y_loss: 0.1431 | g_total_loss: 4.1548
Epoch [ 2580/ 4000] | d_X_loss: 0.1654 | d_Y_loss: 0.1637 | g_total_loss: 3.9657
Epoch [ 2590/ 4000] | d_X_loss: 0.1727 | d_Y_loss: 0.1063 | g_total_loss: 3.6826
Epoch [ 2600/ 4000] | d_X_loss: 0.2200 | d_Y_loss: 0.1501 | g_total_loss: 3.7266
Saved samples_cyclegan/sample-002600-X-Y.png
Saved samples_cyclegan/sample-002600-Y-X.png
Epoch [ 2610/ 4000] | d_X_loss: 0.1886 | d_Y_loss: 0.1938 | g_total_loss: 3.3696
Epoch [ 2620/ 4000] | d_X_loss: 0.2725 | d_Y_loss: 0.0985 | g_total_loss: 3.4911
Epoch [ 2630/ 4000] | d_X_loss: 0.2149 | d_Y_loss: 0.1486 | g_total_loss: 3.5930
Epoch [ 2640/ 4000] | d_X_loss: 0.1617 | d_Y_loss: 0.1663 | g_total_loss: 4.7570
Epoch [ 2650/ 4000] | d_X_loss: 0.1862 | d_Y_loss: 0.1999 | g_total_loss: 4.2615
Epoch [ 2660/ 4000] | d_X_loss: 0.2618 | d_Y_loss: 0.1563 | g_total_loss: 3.2289
Epoch [ 2670/ 4000] | d_X_loss: 0.3090 | d_Y_loss: 0.2034 | g_total_loss: 5.3066
Epoch [ 2680/ 4000] | d_X_loss: 0.2491 | d_Y_loss: 0.1294 | g_total_loss: 3.5480
Epoch [ 2690/ 4000] | d_X_loss: 0.1753 | d_Y_loss: 0.2536 | g_total_loss: 3.7334
Epoch [ 2700/ 4000] | d_X_loss: 0.1680 | d_Y_loss: 0.1466 | g_total_loss: 4.7342
Saved samples_cyclegan/sample-002700-X-Y.png
Saved samples_cyclegan/sample-002700-Y-X.png
Epoch [ 2710/ 4000] | d_X_loss: 0.2538 | d_Y_loss: 0.1211 | g_total_loss: 4.8353
Epoch [ 2720/ 4000] | d_X_loss: 0.0996 | d_Y_loss: 0.1633 | g_total_loss: 4.1933
Epoch [ 2730/ 4000] | d_X_loss: 0.1955 | d_Y_loss: 0.3256 | g_total_loss: 4.7507
Epoch [ 2740/ 4000] | d_X_loss: 0.2422 | d_Y_loss: 0.1407 | g_total_loss: 3.7285
Epoch [ 2750/ 4000] | d_X_loss: 0.1927 | d_Y_loss: 0.1648 | g_total_loss: 3.9994
Epoch [ 2760/ 4000] | d_X_loss: 0.3790 | d_Y_loss: 0.1709 | g_total_loss: 5.3863
Epoch [ 2770/ 4000] | d_X_loss: 0.1408 | d_Y_loss: 0.0896 | g_total_loss: 3.9541
Epoch [ 2780/ 4000] | d_X_loss: 0.1292 | d_Y_loss: 0.1419 | g_total_loss: 4.3423
Epoch [ 2790/ 4000] | d_X_loss: 0.2249 | d_Y_loss: 0.1125 | g_total_loss: 3.5974
Epoch [ 2800/ 4000] | d_X_loss: 0.1269 | d_Y_loss: 0.1301 | g_total_loss: 3.5439
Saved samples_cyclegan/sample-002800-X-Y.png
Saved samples_cyclegan/sample-002800-Y-X.png
Epoch [ 2810/ 4000] | d_X_loss: 0.2377 | d_Y_loss: 0.2841 | g_total_loss: 4.5682
Epoch [ 2820/ 4000] | d_X_loss: 0.1758 | d_Y_loss: 0.1675 | g_total_loss: 3.8426
Epoch [ 2830/ 4000] | d_X_loss: 0.2347 | d_Y_loss: 0.1477 | g_total_loss: 3.8532
Epoch [ 2840/ 4000] | d_X_loss: 0.1106 | d_Y_loss: 0.1322 | g_total_loss: 4.5131
Epoch [ 2850/ 4000] | d_X_loss: 0.1688 | d_Y_loss: 0.1082 | g_total_loss: 4.6591
Epoch [ 2860/ 4000] | d_X_loss: 0.1864 | d_Y_loss: 0.1272 | g_total_loss: 4.3846
Epoch [ 2870/ 4000] | d_X_loss: 0.2201 | d_Y_loss: 0.1917 | g_total_loss: 3.8976
Epoch [ 2880/ 4000] | d_X_loss: 0.2517 | d_Y_loss: 0.1159 | g_total_loss: 3.9381
Epoch [ 2890/ 4000] | d_X_loss: 0.1806 | d_Y_loss: 0.1301 | g_total_loss: 3.8789
Epoch [ 2900/ 4000] | d_X_loss: 0.1654 | d_Y_loss: 0.1572 | g_total_loss: 3.3393
Saved samples_cyclegan/sample-002900-X-Y.png
Saved samples_cyclegan/sample-002900-Y-X.png
Epoch [ 2910/ 4000] | d_X_loss: 0.0870 | d_Y_loss: 0.1557 | g_total_loss: 4.2132
Epoch [ 2920/ 4000] | d_X_loss: 0.2226 | d_Y_loss: 0.1152 | g_total_loss: 4.7609
Epoch [ 2930/ 4000] | d_X_loss: 0.3079 | d_Y_loss: 0.1482 | g_total_loss: 4.6334
Epoch [ 2940/ 4000] | d_X_loss: 0.1700 | d_Y_loss: 0.1071 | g_total_loss: 5.0426
Epoch [ 2950/ 4000] | d_X_loss: 0.2182 | d_Y_loss: 0.1710 | g_total_loss: 3.3385
Epoch [ 2960/ 4000] | d_X_loss: 0.3085 | d_Y_loss: 0.1758 | g_total_loss: 3.8001
Epoch [ 2970/ 4000] | d_X_loss: 0.1698 | d_Y_loss: 0.1228 | g_total_loss: 4.2666
Epoch [ 2980/ 4000] | d_X_loss: 0.2651 | d_Y_loss: 0.0903 | g_total_loss: 4.9024
Epoch [ 2990/ 4000] | d_X_loss: 0.2132 | d_Y_loss: 0.2027 | g_total_loss: 4.7811
Epoch [ 3000/ 4000] | d_X_loss: 0.2283 | d_Y_loss: 0.1506 | g_total_loss: 3.9421
Saved samples_cyclegan/sample-003000-X-Y.png
Saved samples_cyclegan/sample-003000-Y-X.png
Epoch [ 3010/ 4000] | d_X_loss: 0.1785 | d_Y_loss: 0.0923 | g_total_loss: 4.1808
Epoch [ 3020/ 4000] | d_X_loss: 0.1670 | d_Y_loss: 0.2868 | g_total_loss: 3.8953
Epoch [ 3030/ 4000] | d_X_loss: 0.1161 | d_Y_loss: 0.1311 | g_total_loss: 5.3665
Epoch [ 3040/ 4000] | d_X_loss: 0.1252 | d_Y_loss: 0.1173 | g_total_loss: 4.1214
Epoch [ 3050/ 4000] | d_X_loss: 0.2958 | d_Y_loss: 0.1573 | g_total_loss: 4.5007
Epoch [ 3060/ 4000] | d_X_loss: 0.2829 | d_Y_loss: 0.0892 | g_total_loss: 4.2669
Epoch [ 3070/ 4000] | d_X_loss: 0.2496 | d_Y_loss: 0.0957 | g_total_loss: 4.1041
Epoch [ 3080/ 4000] | d_X_loss: 0.1539 | d_Y_loss: 0.2016 | g_total_loss: 3.4195
Epoch [ 3090/ 4000] | d_X_loss: 0.1698 | d_Y_loss: 0.1879 | g_total_loss: 3.6788
Epoch [ 3100/ 4000] | d_X_loss: 0.1325 | d_Y_loss: 0.1203 | g_total_loss: 4.1985
Saved samples_cyclegan/sample-003100-X-Y.png
Saved samples_cyclegan/sample-003100-Y-X.png
Epoch [ 3110/ 4000] | d_X_loss: 0.1814 | d_Y_loss: 0.0717 | g_total_loss: 3.6764
Epoch [ 3120/ 4000] | d_X_loss: 0.2054 | d_Y_loss: 0.1058 | g_total_loss: 3.5443
Epoch [ 3130/ 4000] | d_X_loss: 0.1768 | d_Y_loss: 0.1225 | g_total_loss: 3.6631
Epoch [ 3140/ 4000] | d_X_loss: 0.1612 | d_Y_loss: 0.1082 | g_total_loss: 4.1504
Epoch [ 3150/ 4000] | d_X_loss: 0.1872 | d_Y_loss: 0.1128 | g_total_loss: 3.7797
Epoch [ 3160/ 4000] | d_X_loss: 0.1664 | d_Y_loss: 0.2579 | g_total_loss: 4.8262
Epoch [ 3170/ 4000] | d_X_loss: 0.2603 | d_Y_loss: 0.1617 | g_total_loss: 4.3776
Epoch [ 3180/ 4000] | d_X_loss: 0.1878 | d_Y_loss: 0.0953 | g_total_loss: 3.8488
Epoch [ 3190/ 4000] | d_X_loss: 0.1712 | d_Y_loss: 0.0904 | g_total_loss: 4.3706
Epoch [ 3200/ 4000] | d_X_loss: 0.1596 | d_Y_loss: 0.1179 | g_total_loss: 3.5950
Saved samples_cyclegan/sample-003200-X-Y.png
Saved samples_cyclegan/sample-003200-Y-X.png
Epoch [ 3210/ 4000] | d_X_loss: 0.1859 | d_Y_loss: 0.0787 | g_total_loss: 4.1705
Epoch [ 3220/ 4000] | d_X_loss: 0.1898 | d_Y_loss: 0.1567 | g_total_loss: 4.6342
Epoch [ 3230/ 4000] | d_X_loss: 0.1824 | d_Y_loss: 0.1415 | g_total_loss: 4.2547
Epoch [ 3240/ 4000] | d_X_loss: 0.1255 | d_Y_loss: 0.1042 | g_total_loss: 3.9501
Epoch [ 3250/ 4000] | d_X_loss: 0.1713 | d_Y_loss: 0.1484 | g_total_loss: 3.4231
Epoch [ 3260/ 4000] | d_X_loss: 0.1651 | d_Y_loss: 0.1301 | g_total_loss: 4.4806
Epoch [ 3270/ 4000] | d_X_loss: 0.1338 | d_Y_loss: 0.1153 | g_total_loss: 4.5857
Epoch [ 3280/ 4000] | d_X_loss: 0.1198 | d_Y_loss: 0.2741 | g_total_loss: 5.7033
Epoch [ 3290/ 4000] | d_X_loss: 0.1646 | d_Y_loss: 0.1138 | g_total_loss: 4.9019
Epoch [ 3300/ 4000] | d_X_loss: 0.1783 | d_Y_loss: 0.1159 | g_total_loss: 4.5697
Saved samples_cyclegan/sample-003300-X-Y.png
Saved samples_cyclegan/sample-003300-Y-X.png
Epoch [ 3310/ 4000] | d_X_loss: 0.1351 | d_Y_loss: 0.1042 | g_total_loss: 4.0004
Epoch [ 3320/ 4000] | d_X_loss: 0.1388 | d_Y_loss: 0.1409 | g_total_loss: 3.6784
Epoch [ 3330/ 4000] | d_X_loss: 0.2916 | d_Y_loss: 0.1355 | g_total_loss: 5.0286
Epoch [ 3340/ 4000] | d_X_loss: 0.1453 | d_Y_loss: 0.0953 | g_total_loss: 3.9553
Epoch [ 3350/ 4000] | d_X_loss: 0.1948 | d_Y_loss: 0.0886 | g_total_loss: 3.4559
Epoch [ 3360/ 4000] | d_X_loss: 0.0950 | d_Y_loss: 0.1227 | g_total_loss: 4.4487
Epoch [ 3370/ 4000] | d_X_loss: 0.1001 | d_Y_loss: 0.1183 | g_total_loss: 4.4962
Epoch [ 3380/ 4000] | d_X_loss: 0.1401 | d_Y_loss: 0.1332 | g_total_loss: 3.6945
Epoch [ 3390/ 4000] | d_X_loss: 0.1563 | d_Y_loss: 0.0999 | g_total_loss: 4.2952
Epoch [ 3400/ 4000] | d_X_loss: 0.1569 | d_Y_loss: 0.1528 | g_total_loss: 4.5010
Saved samples_cyclegan/sample-003400-X-Y.png
Saved samples_cyclegan/sample-003400-Y-X.png
Epoch [ 3410/ 4000] | d_X_loss: 0.1393 | d_Y_loss: 0.1328 | g_total_loss: 3.6430
Epoch [ 3420/ 4000] | d_X_loss: 0.2032 | d_Y_loss: 0.1012 | g_total_loss: 3.8178
Epoch [ 3430/ 4000] | d_X_loss: 0.2452 | d_Y_loss: 0.1207 | g_total_loss: 3.9782
Epoch [ 3440/ 4000] | d_X_loss: 0.1524 | d_Y_loss: 0.1371 | g_total_loss: 4.6533
Epoch [ 3450/ 4000] | d_X_loss: 0.1837 | d_Y_loss: 0.0820 | g_total_loss: 3.7113
Epoch [ 3460/ 4000] | d_X_loss: 0.1507 | d_Y_loss: 0.1392 | g_total_loss: 3.8481
Epoch [ 3470/ 4000] | d_X_loss: 0.1472 | d_Y_loss: 0.0944 | g_total_loss: 3.7422
Epoch [ 3480/ 4000] | d_X_loss: 0.1026 | d_Y_loss: 0.1668 | g_total_loss: 4.6097
Epoch [ 3490/ 4000] | d_X_loss: 0.1474 | d_Y_loss: 0.1140 | g_total_loss: 4.4020
Epoch [ 3500/ 4000] | d_X_loss: 0.2269 | d_Y_loss: 0.0840 | g_total_loss: 3.5287
Saved samples_cyclegan/sample-003500-X-Y.png
Saved samples_cyclegan/sample-003500-Y-X.png
Epoch [ 3510/ 4000] | d_X_loss: 0.1289 | d_Y_loss: 0.1068 | g_total_loss: 4.3754
Epoch [ 3520/ 4000] | d_X_loss: 0.2573 | d_Y_loss: 0.0829 | g_total_loss: 3.9122
Epoch [ 3530/ 4000] | d_X_loss: 0.1961 | d_Y_loss: 0.0839 | g_total_loss: 4.3736
Epoch [ 3540/ 4000] | d_X_loss: 0.1125 | d_Y_loss: 0.1386 | g_total_loss: 3.9568
Epoch [ 3550/ 4000] | d_X_loss: 0.1357 | d_Y_loss: 0.0808 | g_total_loss: 4.3151
Epoch [ 3560/ 4000] | d_X_loss: 0.1277 | d_Y_loss: 0.0988 | g_total_loss: 4.4753
Epoch [ 3570/ 4000] | d_X_loss: 0.2596 | d_Y_loss: 0.0937 | g_total_loss: 3.5377
Epoch [ 3580/ 4000] | d_X_loss: 0.2191 | d_Y_loss: 0.0863 | g_total_loss: 3.8304
Epoch [ 3590/ 4000] | d_X_loss: 0.1585 | d_Y_loss: 0.0917 | g_total_loss: 4.2170
Epoch [ 3600/ 4000] | d_X_loss: 0.1728 | d_Y_loss: 0.1891 | g_total_loss: 3.3996
Saved samples_cyclegan/sample-003600-X-Y.png
Saved samples_cyclegan/sample-003600-Y-X.png
Epoch [ 3610/ 4000] | d_X_loss: 0.1132 | d_Y_loss: 0.1160 | g_total_loss: 4.2138
Epoch [ 3620/ 4000] | d_X_loss: 0.1529 | d_Y_loss: 0.0983 | g_total_loss: 4.2615
Epoch [ 3630/ 4000] | d_X_loss: 0.1100 | d_Y_loss: 0.0762 | g_total_loss: 5.0378
Epoch [ 3640/ 4000] | d_X_loss: 0.1155 | d_Y_loss: 0.0849 | g_total_loss: 3.9693
Epoch [ 3650/ 4000] | d_X_loss: 0.1100 | d_Y_loss: 0.0999 | g_total_loss: 4.5041
Epoch [ 3660/ 4000] | d_X_loss: 0.8685 | d_Y_loss: 0.2292 | g_total_loss: 7.3380
Epoch [ 3670/ 4000] | d_X_loss: 5.6597 | d_Y_loss: 0.0652 | g_total_loss: 6.6813
Epoch [ 3680/ 4000] | d_X_loss: 0.5272 | d_Y_loss: 0.0855 | g_total_loss: 3.9980
Epoch [ 3690/ 4000] | d_X_loss: 0.4593 | d_Y_loss: 0.1158 | g_total_loss: 3.4508
Epoch [ 3700/ 4000] | d_X_loss: 0.4520 | d_Y_loss: 0.0850 | g_total_loss: 3.8726
Saved samples_cyclegan/sample-003700-X-Y.png
Saved samples_cyclegan/sample-003700-Y-X.png
Epoch [ 3710/ 4000] | d_X_loss: 0.4835 | d_Y_loss: 0.1403 | g_total_loss: 3.0122
Epoch [ 3720/ 4000] | d_X_loss: 0.4813 | d_Y_loss: 0.1356 | g_total_loss: 4.5557
Epoch [ 3730/ 4000] | d_X_loss: 0.4549 | d_Y_loss: 0.0554 | g_total_loss: 3.7894
Epoch [ 3740/ 4000] | d_X_loss: 0.4939 | d_Y_loss: 0.0855 | g_total_loss: 3.7894
Epoch [ 3750/ 4000] | d_X_loss: 0.5257 | d_Y_loss: 0.1092 | g_total_loss: 3.5162
Epoch [ 3760/ 4000] | d_X_loss: 0.4610 | d_Y_loss: 0.1392 | g_total_loss: 3.0089
Epoch [ 3770/ 4000] | d_X_loss: 0.4773 | d_Y_loss: 0.1103 | g_total_loss: 4.4122
Epoch [ 3780/ 4000] | d_X_loss: 0.4823 | d_Y_loss: 0.0650 | g_total_loss: 4.0688
Epoch [ 3790/ 4000] | d_X_loss: 0.4549 | d_Y_loss: 7.5054 | g_total_loss: 7.0633
Epoch [ 3800/ 4000] | d_X_loss: 0.4854 | d_Y_loss: 0.4875 | g_total_loss: 3.5842
Saved samples_cyclegan/sample-003800-X-Y.png
Saved samples_cyclegan/sample-003800-Y-X.png
Epoch [ 3810/ 4000] | d_X_loss: 0.4973 | d_Y_loss: 0.5255 | g_total_loss: 3.0981
Epoch [ 3820/ 4000] | d_X_loss: 0.4389 | d_Y_loss: 0.5273 | g_total_loss: 3.3052
Epoch [ 3830/ 4000] | d_X_loss: 0.4935 | d_Y_loss: 0.5281 | g_total_loss: 3.0967
Epoch [ 3840/ 4000] | d_X_loss: 0.4728 | d_Y_loss: 0.5430 | g_total_loss: 2.8479
Epoch [ 3850/ 4000] | d_X_loss: 0.4454 | d_Y_loss: 0.4561 | g_total_loss: 3.2698
Epoch [ 3860/ 4000] | d_X_loss: 0.4300 | d_Y_loss: 0.5028 | g_total_loss: 2.9159
Epoch [ 3870/ 4000] | d_X_loss: 0.4966 | d_Y_loss: 0.4839 | g_total_loss: 2.5131
Epoch [ 3880/ 4000] | d_X_loss: 0.4554 | d_Y_loss: 0.4816 | g_total_loss: 2.7783
Epoch [ 3890/ 4000] | d_X_loss: 0.4863 | d_Y_loss: 0.5263 | g_total_loss: 3.4355
Epoch [ 3900/ 4000] | d_X_loss: 0.4521 | d_Y_loss: 0.4907 | g_total_loss: 2.5306
Saved samples_cyclegan/sample-003900-X-Y.png
Saved samples_cyclegan/sample-003900-Y-X.png
Epoch [ 3910/ 4000] | d_X_loss: 0.4536 | d_Y_loss: 0.4365 | g_total_loss: 2.8774
Epoch [ 3920/ 4000] | d_X_loss: 0.4681 | d_Y_loss: 0.4440 | g_total_loss: 2.5940
Epoch [ 3930/ 4000] | d_X_loss: 0.4534 | d_Y_loss: 0.4301 | g_total_loss: 2.7251
Epoch [ 3940/ 4000] | d_X_loss: 0.4632 | d_Y_loss: 0.4114 | g_total_loss: 2.6697
Epoch [ 3950/ 4000] | d_X_loss: 0.4345 | d_Y_loss: 0.4898 | g_total_loss: 2.6657
Epoch [ 3960/ 4000] | d_X_loss: 0.4509 | d_Y_loss: 0.5059 | g_total_loss: 2.7177
Epoch [ 3970/ 4000] | d_X_loss: 0.4844 | d_Y_loss: 0.4546 | g_total_loss: 2.5827
Epoch [ 3980/ 4000] | d_X_loss: 0.5180 | d_Y_loss: 0.4715 | g_total_loss: 2.7952
Epoch [ 3990/ 4000] | d_X_loss: 0.4662 | d_Y_loss: 0.4012 | g_total_loss: 3.0120
Epoch [ 4000/ 4000] | d_X_loss: 0.4361 | d_Y_loss: 0.4799 | g_total_loss: 2.6911
Saved samples_cyclegan/sample-004000-X-Y.png
Saved samples_cyclegan/sample-004000-Y-X.png
A lot of experimentation goes into finding hyperparameters such that the generators and discriminators don't overpower each other. It's often a good starting point to look at existing papers to find what has worked in previous experiments; I'd recommend the DCGAN paper, in addition to the original CycleGAN paper, to see what worked for those authors. Then you can run your own experiments based off of a good foundation.
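As a concrete reference point, below is a minimal sketch of an optimizer setup using the values the DCGAN authors reported as stable (lr=0.0002, beta1=0.5). The model names (G_XtoY, G_YtoX, D_X, D_Y) refer to the networks defined earlier in this notebook; treat the values as a starting point rather than the only ones that can work.
import torch.optim as optim

# hyperparameters in the spirit of the DCGAN paper; a starting point, not a prescription
lr = 0.0002
beta1 = 0.5    # a lower first-moment term tends to stabilize GAN training
beta2 = 0.999  # the usual default second-moment term

# both generators are updated together, so their parameters share one optimizer
g_params = list(G_XtoY.parameters()) + list(G_YtoX.parameters())

g_optimizer = optim.Adam(g_params, lr, [beta1, beta2])
d_x_optimizer = optim.Adam(D_X.parameters(), lr, [beta1, beta2])
d_y_optimizer = optim.Adam(D_Y.parameters(), lr, [beta1, beta2])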
When you display the generator and discriminator losses, you should see that there is always some discriminator loss; recall that we are trying to design a model that can generate good "fake" images. So, the ideal discriminator will not be able to tell the difference between real and fake images and, as such, will always have some loss. You should also see that $D_X$ and $D_Y$ are roughly at the same loss levels; if they are not, this indicates that your training is favoring one type of discriminator over the other, and you may need to look at biases in your models or data.
The generators' total loss should start significantly higher than the discriminator losses because it accounts for the adversarial loss of both generators as well as the weighted reconstruction (cycle consistency) errors. You should see this loss decrease a lot at the start of training because initial, generated images are often far off from being good fakes. After some time it may level off; this is normal since the generator and discriminator are both improving as they train. If you see that the loss is jumping around a lot over time, you may want to try decreasing your learning rates or weighting your cycle consistency loss a little more or less heavily; a sketch of how those terms combine follows below.
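To make the weighting suggestion concrete, here is a minimal sketch of how the logged g_total_loss might be assembled, assuming a least-squares adversarial loss and an L1-based cycle consistency loss. The helper names (real_mse_loss, cycle_consistency_loss) and lambda_weight are illustrative, and lambda_weight is the knob you would tune.
# a sketch of the combined generator objective (illustrative helper names):
# real_mse_loss(D_out) pushes D's score on fakes toward 1 (least-squares GAN);
# cycle_consistency_loss(real, reconstructed, lambda_weight) is lambda * L1
lambda_weight = 10  # raise/lower this to weight reconstruction more/less heavily

fake_X = G_YtoX(images_Y)                    # translate Y -> X
g_YtoX_loss = real_mse_loss(D_X(fake_X))     # adversarial term for G_YtoX
reconstructed_Y = G_XtoY(fake_X)             # cycle Y -> X -> Y
cycle_Y_loss = cycle_consistency_loss(images_Y, reconstructed_Y, lambda_weight)

fake_Y = G_XtoY(images_X)                    # translate X -> Y
g_XtoY_loss = real_mse_loss(D_Y(fake_Y))     # adversarial term for G_XtoY
reconstructed_X = G_YtoX(fake_Y)             # cycle X -> Y -> X
cycle_X_loss = cycle_consistency_loss(images_X, reconstructed_X, lambda_weight)

# this sum is what the g_total_loss column in the log above tracks
g_total_loss = g_YtoX_loss + g_XtoY_loss + cycle_Y_loss + cycle_X_loss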
fig, ax = plt.subplots(figsize=(12,8))
# losses was recorded during training as one (d_X_loss, d_Y_loss, g_total_loss)
# tuple per logged epoch, so transposing gives one series per curve
losses = np.array(losses)
plt.plot(losses.T[0], label='Discriminator, X', alpha=0.5)
plt.plot(losses.T[1], label='Discriminator, Y', alpha=0.5)
plt.plot(losses.T[2], label='Generators', alpha=0.5)
plt.title("Training Losses")
plt.legend()
<matplotlib.legend.Legend at 0x7fa8a80ff0b8>
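If the raw curves are too noisy to read at a glance, a simple moving average makes the trends clearer; this smoothing step is just a plotting convenience and was not part of the training code.
# optional: smooth each loss curve with a simple moving average for readability
def smooth(vals, window=10):
    # average each point with its neighbors over a sliding window
    return np.convolve(vals, np.ones(window) / window, mode='valid')

fig, ax = plt.subplots(figsize=(12, 8))
plt.plot(smooth(losses.T[0]), label='Discriminator, X (smoothed)', alpha=0.5)
plt.plot(smooth(losses.T[1]), label='Discriminator, Y (smoothed)', alpha=0.5)
plt.plot(smooth(losses.T[2]), label='Generators (smoothed)', alpha=0.5)
plt.title("Smoothed Training Losses")
plt.legend()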
As you trained this model, you may have chosen to sample and save the results of your generated images after a certain number of training iterations. This gives you a way to see whether or not your generators are creating good fake images. For example, the image below depicts real images in the $Y$ set, and the corresponding generated images at different points in the training process. You can see that the generator starts out creating very noisy fake images, but begins to converge to better (though not perfect) representations as it trains.

Below, you've been given a helper function for displaying generated samples based on the passed-in training iteration.
import matplotlib.image as mpimg

# helper visualization code
def view_samples(iteration, sample_dir='samples_cyclegan'):
    # samples are named by iteration
    path_XtoY = os.path.join(sample_dir, 'sample-{:06d}-X-Y.png'.format(iteration))
    path_YtoX = os.path.join(sample_dir, 'sample-{:06d}-Y-X.png'.format(iteration))
    # read in those samples
    try:
        x2y = mpimg.imread(path_XtoY)
        y2x = mpimg.imread(path_YtoX)
    except FileNotFoundError:
        # no sample was saved at this iteration, so there is nothing to show
        print('Invalid number of iterations.')
        return
    # display the X->Y samples on top and the Y->X samples below
    fig, (ax1, ax2) = plt.subplots(figsize=(18,20), nrows=2, ncols=1, sharey=True, sharex=True)
    ax1.imshow(x2y)
    ax1.set_title('X to Y')
    ax2.imshow(y2x)
    ax2.set_title('Y to X')
# view samples at iteration 100
view_samples(100, 'samples_cyclegan')
# view samples at iteration 4000
view_samples(4000, 'samples_cyclegan')
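Since samples were saved every 100 iterations (as the "Saved ..." lines in the training log show), you can also step through the entire run with a small loop like this one.
# view every saved checkpoint in order, from the first save to the last
for iteration in range(100, 4001, 100):
    view_samples(iteration, 'samples_cyclegan')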
Once you are satisfied with your model, you are encouraged to test it on a different dataset to see if it can find different types of mappings!
You can download a variety of datasets used in the Pix2Pix and CycleGAN papers by following the instructions in the associated GitHub repository. You'll just need to make sure that the data directories are named and organized correctly to load in that data.
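For example, loading one of those downloaded datasets might look like the sketch below; the apple2orange directory and its apple/orange subfolder names are hypothetical here, and should match however the downloaded data is actually organized (with matching test folders for each domain).
# hypothetical example: reusing get_data_loader on another unpaired dataset,
# assuming it unzips into a main directory with one subdirectory per domain
dataloader_X, test_dataloader_X = get_data_loader(image_type='apple',
                                                  image_dir='apple2orange')
dataloader_Y, test_dataloader_Y = get_data_loader(image_type='orange',
                                                  image_dir='apple2orange')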